8 research outputs found

    Relational queries with a tensor processing unit

    Tensor Processing Units (TPUs) are specialized hardware devices built to train and apply machine learning models at high speed through high-bandwidth memory and massive instruction parallelism. In this short paper, we investigate how relational operations can be translated to those devices. We present a mapping of relational operators to TPU-supported TensorFlow operations, along with experimental results comparing it with GPU and CPU implementations. The results show that while raw speeds are enticing, TPUs are unlikely to improve relational query processing for now due to a variety of issues.
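    As a rough illustration of the kind of operator mapping the paper describes, a selection-plus-grouped-aggregation query can be expressed as dense tensor operations. The sketch below uses NumPy as a stand-in for the corresponding TensorFlow kernels (`tf.boolean_mask`, `tf.math.segment_sum`); the toy relation and query are invented for illustration:

```python
import numpy as np

# Toy relation for: SELECT grp, SUM(val) FROM t WHERE val > 10 GROUP BY grp
grp = np.array([0, 1, 0, 1, 2])
val = np.array([5, 20, 15, 8, 30])

# Selection -> boolean mask (the analogue of tf.boolean_mask)
mask = val > 10
sel_grp, sel_val = grp[mask], val[mask]

# Grouped aggregation -> segmented sum (the analogue of tf.math.segment_sum)
sums = np.bincount(sel_grp, weights=sel_val, minlength=3)
```

    Expressing both steps as whole-array kernels is what lets the query run as a handful of large tensor instructions rather than a tuple-at-a-time loop.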

    Progressive Mergesort: Merging batches of appends into progressive indexes

    Interactive exploratory data analysis consists of workloads composed of filter-aggregate queries with highly selective filters [1]. Hence, their performance depends on how much data they can skip during their scans, with indexes being the most efficient technique for aggressive data skipping. Progressive Indexes are the state of the art in automatic index creation for interactive exploratory data analysis. These indexes are partially constructed during query execution, eventually refining into a full index. However, Progressive Indexes were designed for static databases, while in exploratory data analysis updates, usually batch-appends of newly acquired data, are frequent. In this paper, we propose Progressive Mergesort, a novel merging technique that makes Progressive Indexes cope with updates. Progressive Mergesort differs from other merging techniques for partial indexes in that it incorporates the budget-driven design of Progressive Indexing. It follows the same three principles as Progressive Indexes: (1) fast query execution, (2) high robustness, and (3) guaranteed convergence. Our experimental evaluation demonstrates that Progressive Mergesort achieves a 2x speedup when merging updates and up to three orders of magnitude lower variance than the state of the art.
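    One way to picture the budgeted-merging idea is the sketch below, a deliberate simplification rather than the paper's actual algorithm: each query triggers at most a fixed budget of merge work, so a sorted append batch is absorbed into the index gradually instead of in one stall:

```python
import bisect

class ProgressiveMerge:
    """Budgeted merge of a sorted append batch into a sorted index
    (illustrative simplification of the budget principle)."""

    def __init__(self, index, batch):
        self.index = sorted(index)     # the already-converged sorted index
        self.pending = sorted(batch)   # newly appended batch, not yet merged

    def step(self, budget):
        """Perform at most `budget` insertions; called once per query so
        the merge cost is amortized across the workload."""
        for _ in range(min(budget, len(self.pending))):
            bisect.insort(self.index, self.pending.pop(0))
        return not self.pending        # True once the batch is fully merged
```

    Bounding the per-query merge work is what keeps query latency robust while still guaranteeing that the index eventually converges.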

    Deep integration of machine learning into column stores

    We leverage vectorized User-Defined Functions (UDFs) to efficiently integrate unchanged machine learning pipelines into an analytical data management system. Entire pipelines, including data, models, parameters, and evaluation outcomes, are stored and executed inside the database system. Experiments using our MonetDB/Python UDFs show greatly improved performance due to reduced data movement and parallel processing opportunities. In addition, this integration enables meta-analysis of models using relational queries.
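    A minimal sketch of what the body of such a vectorized UDF looks like: the function receives a whole column as a NumPy array, so a model can score every row in one call instead of once per row (the threshold predicate here is only a stand-in for a trained model):

```python
import numpy as np

# Body of a vectorized UDF: the column arrives as one NumPy array, so the
# "model" scores all rows in a single call instead of once per row.
def predict(temperature):
    # Stand-in for a trained model's decision function.
    return np.where(temperature > 30.0, 1, 0)
```

    In MonetDB, a body like this would be registered with a `CREATE FUNCTION ... LANGUAGE PYTHON` statement and then called directly from SQL, which is what avoids the data movement that external pipelines pay for.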

    Cracking KD-Tree: The first multidimensional adaptive indexing

    Workload-aware physical data access structures are crucial to achieve short response times in (exploratory) data analysis tasks, as commonly required for Big Data and Data Science applications. Recently proposed techniques such as automatic index advisers (for a priori known static workloads) and query-driven adaptive incremental indexing (for a priori unknown dynamic workloads) form the state of the art for building single-dimensional indexes over single-attribute query predicates. However, similar techniques for more demanding multi-attribute query predicates, which are vital for any data analysis task, have not been proposed yet. In this paper, we present our ongoing work on a new set of workload-adaptive indexing techniques that focus on creating multidimensional indexes. We present our proof of concept, the Cracking KD-Tree, an adaptive indexing approach that generates a KD-Tree based on multidimensional range query predicates. It works by incrementally creating partial multidimensional indexes as a by-product of query processing. The indexes are produced only on those parts of the data that are accessed, and their creation cost is effectively distributed across a stream of queries. Experimental results show that the Cracking KD-Tree is three times faster than creating a full KD-Tree, one order of magnitude faster than executing full scans, and two orders of magnitude faster than using uni-dimensional full or adaptive indexes on multiple columns.
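    The core cracking step can be sketched as an in-place partition of the touched data around a query bound, with the partitioning dimension alternating per tree level, a simplified illustration of the idea rather than the paper's full structure:

```python
def crack(rows, lo, hi, dim, pivot):
    """Partition rows[lo:hi] in place around `pivot` on dimension `dim`,
    returning the split index. Applied as a by-product of each range query,
    such splits incrementally grow a KD-Tree over only the accessed data."""
    split = lo
    for j in range(lo, hi):
        if rows[j][dim] < pivot:
            rows[split], rows[j] = rows[j], rows[split]
            split += 1
    return split
```

    Because each query only partitions the pieces it actually reads, the cost of building the tree is spread across the query stream, which is the property the experiments quantify.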

    Multidimensional adaptive & progressive indexes

    Exploratory data analysis is the primary technique used by data scientists to extract knowledge from new data sets. This type of workload is composed of trial-and-error hypothesis-driven queries with a human in the loop. To keep up with the data scientist's productivity, the system must be capable of answering queries in interactive times. Given that these queries are highly selective multidimensional queries, multidimensional indexes are necessary to ensure low latency. However, creating the appropriate indexes is not a given due to the highly exploratory and interactive nature of such human-in-the-loop scenarios. In this paper, we identify four main objectives that are desirable for exploratory data analysis workloads: (1) low overhead over the initial queries, (2) low query variance (i.e., high robustness), (3) predictable index convergence, and (4) low total workload time. Given that not all of them can be achieved at the same time, we present three novel incremental multidimensional indexing techniques that represent three sample points on a Pareto front for this multi-objective optimization problem. (a) The Adaptive KD-Tree is designed to achieve the lowest total workload time at the expense of a higher indexing penalty for the initial queries, lack of robustness, and unpredictable convergence. (b) The Progressive KD-Tree has predictable convergence and a user-defined indexing cost for the initial queries. However, total workload time can be higher than with Adaptive KD-Trees, and per-query time still varies. (c) The Greedy Progressive KD-Tree aims at full robustness at the expense of only improving the per-query cost after full index convergence. Our extensive experimental evaluation using both synthetic and real-life data sets and workloads shows that (a) the Adaptive KD-Tree reduc…
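    The budget principle shared by the progressive variants can be sketched as interleaving each query with a bounded amount of index refinement; all names below are illustrative, not the paper's API:

```python
def run_workload(queries, answer, refine, budget):
    """Answer each query, then spend at most `budget` units of work on
    index refinement, keeping per-query overhead predictable. `refine`
    does one unit of indexing work and returns False once the index has
    fully converged (illustrative sketch of the budget principle)."""
    results = []
    for q in queries:
        results.append(answer(q))
        for _ in range(budget):
            if not refine():
                break
    return results
```

    Tuning `budget` is exactly the trade-off the paper explores: a larger budget converges the index sooner at the cost of heavier initial queries.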

    Multidimensional Datasets

    Two-column multidimensional queries

    Don’t Keep My UDFs Hostage - Exporting UDFs For Debugging Purposes

    User-defined functions (UDFs) are an integral part of performing in-database analytics. Executing data analysis inside a database provides significant improvements over traditional methods, such as close-to-the-data execution, low conversion overhead, and automatic parallelization. However, UDFs have poor support for debugging. Since they are executed from within the database process, traditional debugging tools such as Integrated Development Environments (IDEs) and Read-Eval-Print Loops (REPLs) cannot be used during development. As a result, writing functional UDFs is challenging. In this paper, we present an extension to the open-source database system MonetDB that allows developers to debug their UDFs using modern debugging techniques.
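    The export idea can be sketched as bundling the UDF with a captured sample of its input columns, so the function can be replayed and stepped through outside the database process; this is an illustration of the concept, not MonetDB's actual interface:

```python
def export_udf(udf, sample_inputs):
    """Pair a UDF with a captured sample of its input columns so it can be
    replayed outside the database process, e.g. under an IDE debugger or a
    REPL (conceptual sketch only)."""
    def replay():
        return udf(*sample_inputs)
    return replay
```

    Once the function and a representative input sample live in an ordinary script, breakpoints, watches, and REPL experimentation all work as usual.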

    Slaveholding exile: Hercule Florence and the sugar and coffee frontiers of the Paulista West (1830-1879)

    ABSTRACT This article investigates the trajectory of the artist and inventor Antoine Hercule Romuald Florence (1804-1879) in nineteenth-century Brazilian slaveholding society, examining the foundations of the "feeling of exile" that marked his long life in the West of São Paulo. In the first part, I treat Florence as an observer of the slaveholding landscapes of sugar and coffee. The series of drawings and watercolors he composed of the Ibicaba farm and the Cachoeira sugar mill allows us to observe how he apprehended the concrete processes of agrarian and environmental transformation on São Paulo's slaveholding frontier. In the second part, I analyze Florence's conversion into a slaveholding coffee planter, the moment when, for family reasons, he took over the management of a coffee estate with thirty slaves in the municipality of Campinas.